Non-native speech database (English Wikipedia)

A non-native speech database is a speech database of non-native pronunciations of English. Such databases are essential for the ongoing development of multilingual automatic speech recognition systems, text-to-speech systems, pronunciation trainers and even fully featured second-language learning systems. Because the databases are comparatively small, however, many of them are not available through the common distributors of speech databases. As a result, it is hard for researchers in speech recognition to keep track of which databases have already been collected and for which purposes no collection exists yet.
This article is based on a paper from the ASRU speech conference.〔M. Raab, R. Gruhn and E. Noeth, ''Non-Native speech databases'', in Proc. ASRU, Kyoto, Japan, 2007.〕 The paper aimed to provide a useful resource addressing the problem described above. This online article is intended as a place where the speech research community can continuously update information about non-native speech databases.
==Legend==
The table of non-native databases uses abbreviations for some language names; these are listed in Table 1. Table 2 gives the following information about each corpus: the name of the corpus; the institution where the corpus can be obtained, or where at least further information should be available; the language actually spoken by the speakers; the number of speakers; the native language of the speakers; the total number of non-native utterances the corpus contains; the duration in hours of the non-native part; the date of the first public reference to the corpus; free text highlighting special aspects of the database; and a reference to another publication. In most cases, the reference in the last field is to the paper in which the original collectors describe the corpus. Where no such paper could be identified, a paper that uses the corpus is referenced instead.
Some entries are left blank and others are marked as unknown. Blank entries refer to attributes whose value is simply not known. Unknown entries, by contrast, indicate that the database itself contains no information about the attribute. For example, the Jupiter weather database〔K. Livescu, ''Analysis and modeling of non-native speech for automatic speech recognition'', M.S. thesis, Massachusetts Institute of Technology, Cambridge, MA, 1999.〕 gives no information about the origin of the speakers, which makes the data less useful for verifying accent detection or similar tasks.
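The columns of Table 2 and the distinction between blank and unknown entries can be sketched as a record type. This is only an illustrative sketch: the class name `CorpusEntry`, the field names, the `UNKNOWN` sentinel and the helper method are assumptions made here and do not appear in the original table. A blank entry is modelled as `None`, and an unknown entry as the sentinel.

```python
from dataclasses import dataclass
from typing import Union


class _Unknown:
    """Sentinel: the database itself holds no information on this attribute."""

    def __repr__(self) -> str:
        return "UNKNOWN"


UNKNOWN = _Unknown()

# None models a blank entry (value simply not known);
# UNKNOWN models an entry marked "unknown" in the table.
Value = Union[str, int, float, None, _Unknown]


@dataclass
class CorpusEntry:
    # Field names are hypothetical; they follow the columns described for Table 2.
    name: str
    institution: Value = None
    spoken_language: Value = None
    num_speakers: Value = None
    native_languages: Value = None
    num_utterances: Value = None
    duration_hours: Value = None
    first_reference_date: Value = None
    specials: Value = None
    reference: Value = None

    def has_native_language_info(self) -> bool:
        # True only when a concrete native-language value is recorded.
        return (
            self.native_languages is not None
            and self.native_languages is not UNKNOWN
        )


# Modelled on the Jupiter example: speaker origin is not given in the database.
jupiter = CorpusEntry(name="Jupiter", native_languages=UNKNOWN)
print(jupiter.has_native_language_info())
```

Keeping the two cases distinct lets a later update fill in a blank value without confusing it with information that the corpus itself never recorded.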
Where possible, the name is the standard name of the corpus. For some of the smaller corpora, however, there was no established name, so an identifier had to be created. In such cases, a combination of the institution and the collector of the database is used.
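The naming rule can be sketched as a small helper. The function name `corpus_identifier`, its parameters, and the choice of a single space as separator are assumptions for illustration; the article does not specify how the institution and collector are combined.

```python
from typing import Optional


def corpus_identifier(
    standard_name: Optional[str], institution: str, collector: str
) -> str:
    """Return the established corpus name, or an institution/collector fallback."""
    if standard_name and standard_name.strip():
        return standard_name.strip()
    # No established name: combine institution and collector.
    # The separator is an assumption; the source does not define one.
    return f"{institution.strip()} {collector.strip()}"


# Hypothetical values, not taken from the table.
print(corpus_identifier(None, "Some University", "Doe"))
```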
Where a database contains both native and non-native speech, only attributes of the non-native part of the corpus are listed. Most of the corpora are collections of read speech. If a corpus instead consists partly or completely of spontaneous utterances, this is mentioned in the Specials column.

Excerpt source: Wikipedia, the free encyclopedia